AIX/XCOFF support for fibers #7338

Merged
merged 6 commits into from
Aug 16, 2021

NattyNarwhal
Member

The fibers PR neglected to import the AIX-specific code from Boost's context library. This PR reinstates those files and adds them to the autoconf build.

This may also be required for Linux where _CALL_ELF != 2 (ELFv1), since that shares the AIX ABI; that covers pretty much every non-LE enterprise distro, e.g. EL7.
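
The ABI split described above can be checked at compile time. This is a minimal sketch, assuming the usual predefined macros (`_AIX`, `__powerpc64__`, `__powerpc__`, and GCC's `_CALL_ELF`); `demo_ppc_abi_name` is a hypothetical helper for illustration, not something in PHP or Boost.

```c
#include <stddef.h>

/* Hypothetical helper: names the PowerPC calling convention the
 * compiler is targeting, based on predefined macros only. */
const char *demo_ppc_abi_name(void)
{
#if defined(_AIX)
    return "AIX (XCOFF, function descriptors)";
#elif defined(__powerpc64__) && defined(_CALL_ELF) && _CALL_ELF == 2
    return "ppc64 ELFv2";
#elif defined(__powerpc64__)
    /* _CALL_ELF is 1 or undefined: treated here as ELFv1 (assumption). */
    return "ppc64 ELFv1 (function descriptors, AIX-like)";
#elif defined(__powerpc__)
    return "32-bit PowerPC";
#else
    return "not PowerPC";
#endif
}
```

A configure-time check would likely key off the same macros, but how autoconf should select the assembly file is a separate question from this sketch.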

@NattyNarwhal NattyNarwhal marked this pull request as draft August 3, 2021 17:03
@NattyNarwhal
Member Author

Marked as draft because I haven't tested; the PHP runtime works but I'm not familiar with fibers just yet. If it works, just unmark it as such.

@NattyNarwhal
Member Author

I guess this is why I marked it draft:

(gdb) where
#0  0x00000001003eb67c in make_fcontext ()
#1  0x0000000100044308 in zend_fiber_init_context (context=0x700000000079440, kind=0x18027aa70, 
    coroutine=@0x1800baf40: 0x100043704 <zend_fiber_execute>, stack_size=2097152)
    at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_fibers.c:25
#2  0x0000000100045014 in zim_Fiber_start (execute_data=0x430f08, return_value=0x7000000000140d0)
    at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_fibers.c:18
#3  0x00000001000efcf4 in execute_ex (ex=0x430f08) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_execute.c:52634
#4  0x00000001000f1b7c in zend_execute (op_array=0x430f08, return_value=0x200000)
    at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_execute.c:58834
#5  0x0000000100048c74 in zend_execute_scripts (type=4394760, retval=0x200000, file_count=-2146717912)
    at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend.c:1789
#6  0x00000001001c3708 in php_execute_script (primary_file=0xa00) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/main/main.c:2517
#7  0x0000000100002e30 in do_cli (argc=2, argv=0x1800f6150) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/sapi/cli/php_cli.c:363
#8  0x00000001000008cc in main (argc=2, argv=0x200000) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/sapi/cli/php_cli.c:1367
(gdb) disassemble 
Dump of assembler code for function make_fcontext:
   0x00000001003eb670 <+0>:     mflr    r6
   0x00000001003eb674 <+4>:     rlwinm  r3,r3,0,0,27
   0x00000001003eb678 <+8>:     addi    r3,r3,-248
=> 0x00000001003eb67c <+12>:    stw     r5,176(r3)
   0x00000001003eb680 <+16>:    li      r0,0
   0x00000001003eb684 <+20>:    std     r0,184(r3)
   0x00000001003eb688 <+24>:    addi    r0,r3,232
   0x00000001003eb68c <+28>:    mr      r4,r0
   0x00000001003eb690 <+32>:    std     r4,152(r3)
   0x00000001003eb694 <+36>:    mflr    r0
   0x00000001003eb698 <+40>:    bl      0x1003eb69c <make_fcontext+44>
   0x00000001003eb69c <+44>:    mflr    r4
   0x00000001003eb6a0 <+48>:    addi    r4,r4,24
   0x00000001003eb6a4 <+52>:    mtlr    r0
   0x00000001003eb6a8 <+56>:    stw     r4,168(r3)
   0x00000001003eb6ac <+60>:    mtlr    r6
   0x00000001003eb6b0 <+64>:    blr
   0x00000001003eb6b4 <+68>:    mflr    r0
   0x00000001003eb6b8 <+72>:    stw     r0,8(r1)
   0x00000001003eb6bc <+76>:    stwu    r1,-32(r1)
   0x00000001003eb6c0 <+80>:    li      r3,0
   0x00000001003eb6c4 <+84>:    bl      0x1002affe4 <_exit>
   0x00000001003eb6c8 <+88>:    ld      r2,40(r1)
End of assembler dump.
(gdb) info reg
r0             0x3618   13848
r1             0xfffffffffffcf50        1152921504606834512
r2             0x1800c8020      6443270176
r3             0x430f08 4394760
r4             0x200000 2097152
r5             0x1800baf28      6443216680
r6             0x100044308      4295246600
r7             0x26bcfc4        40619972
r8             0x68258  426584
r9             0x700000000001070        504403158265499760
r10            0x700000000001080        504403158265499776
r11            0xb00e200000179f30       12686112384722902832
r12            0x800000000000f032       9223372036854837298
r13            0x1800f6340      6443459392
r14            0x700000000014020        504403158265577504
r15            0x70000000006e0c0        504403158265946304
r16            0xffffffffffffbf0        1152921504606845936
r17            0x800200140000000        576495942044221440
r18            0xffffffffffffed0        1152921504606846672
r19            0x6      6
r20            0x700000000079400        504403158265992192
r21            0x700000000079400        504403158265992192
r22            0x18027af10      6445051664
r23            0x1a     26
r24            0x2      2
r25            0x18027aa70      6445050480
r26            0x1800baf40      6443216704
r27            0x201000 2101248
r28            0x700000000079440        504403158265992256
r29            0x700000000230000        504403158267789312
r30            0x700000000231000        504403158267793408
r31            0x200000 2097152
pc             0x1003eb67c      0x1003eb67c <make_fcontext+12>
msr            0x800000000000f032       9223372036854837298
cr             0x22002424       570434596
lr             0x100044308      0x100044308 <zend_fiber_init_context+196>
ctr            0x0      0
xer            0x4000000        67108864

Poking at it, but I didn't spy any site-local modifications to PHP's imported files. I could also just fall back to no asm as well, like #7226 did.

@trowski
Member

trowski commented Aug 3, 2021

To be sure I'm understanding the above correctly, it's failing on the store instruction for the context function?

@NattyNarwhal
Member Author

Seems so. I compared the ELF version and it looks fairly similar, though the use of stw (presumably a 32-bit store) seems sus compared to the std in the ELF version.

ucontext_t works for now, so this PR isn't too high priority.

@trowski
Member

trowski commented Aug 3, 2021

It appears to be using make_ppc32_sysv_xcoff_gas.S, not make_ppc64_sysv_xcoff_gas.S. If you're expecting 64-bit, then perhaps something needs to be changed in this section to detect 64-bit PPC properly on AIX. Guessing it is matching to ppc32 currently.

@NattyNarwhal
Member Author

> It appears to be using make_ppc32_sysv_xcoff_gas.S, not make_ppc64_sysv_xcoff_gas.S.

It is 64-bit, and it seems to be selecting the 64-bit assembly properly, judging by a comparison of the two source files with the compiled code:

32-bit:

#.make_fcontext:
    # save return address into R6
    mflr  6

    # first arg of make_fcontext() == top address of context-function
    # shift address in R3 to lower 16 byte boundary
    clrrwi  3, 3, 4

    # reserve space for context-data on context-stack
    # including 64 byte of linkage + parameter area (R1 % 16 == 0)
    subi  3, 3, 336

    # third arg of make_fcontext() == address of context-function
    stw  5, 240(3)

64-bit:

#._make_fcontext:
    # save return address into R6
    mflr  6

    # first arg of make_fcontext() == top address of context-function
    # shift address in R3 to lower 16 byte boundary
    clrrwi  3, 3, 4

    # reserve space for context-data on context-stack
    # including 64 byte of linkage + parameter area (R1 % 16 == 0)
    subi  3, 3, 248

    # third arg of make_fcontext() == address of context-function
    stw  5, 176(3)

And GDB:

Dump of assembler code for function make_fcontext:
   0x00000001003eb670 <+0>:     mflr    r6
   0x00000001003eb674 <+4>:     rlwinm  r3,r3,0,0,27
   0x00000001003eb678 <+8>:     addi    r3,r3,-248
=> 0x00000001003eb67c <+12>:    stw     r5,176(r3)

@trowski
Member

trowski commented Aug 3, 2021

Oh, I missed that make_ppc64_sysv_xcoff_gas.S is using stw on line 25 when it uses std farther down. Try changing stw to std and see if that works. Now I understand what you meant about it being suspect compared to the ELF version as those files use std for that instruction.
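
For reference, the difference between the two stores can be modelled in portable C. This is only a sketch of the instruction semantics on a big-endian 64-bit target, not a claim about which instruction the Boost file should use; `demo_stw`, `demo_std`, and `demo_ld` are hypothetical names.

```c
#include <stdint.h>

/* Model of big-endian `stw`: store the low 32 bits of reg at mem[0..3]. */
void demo_stw(uint8_t *mem, uint64_t reg)
{
    for (int i = 0; i < 4; i++)
        mem[i] = (uint8_t)(reg >> (24 - 8 * i));
}

/* Model of big-endian `std`: store all 64 bits of reg at mem[0..7]. */
void demo_std(uint8_t *mem, uint64_t reg)
{
    for (int i = 0; i < 8; i++)
        mem[i] = (uint8_t)(reg >> (56 - 8 * i));
}

/* Model of big-endian `ld`: load 64 bits from mem[0..7]. */
uint64_t demo_ld(const uint8_t *mem)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | mem[i];
    return v;
}
```

With a zeroed 8-byte slot, storing 0x1800baf28 (the r5 value from the first register dump) through the stw model and reading it back with the ld model yields 0x800baf2800000000, while the std model round-trips the value unchanged.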

@NattyNarwhal
Member Author

NattyNarwhal commented Aug 4, 2021

Doesn't seem to change it; the PC is at the same place at the crash site. I do note r3 (first arg reg) does seem a bit funny:

(gdb) break make_fcontext
Breakpoint 1 at 0x103eb6a0
(gdb) run
Starting program: /QOpenSys/pkgs/bin/php -d error_log= /home/CALVIN/fiber.php
[New Thread 1]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1]
0x00000001003eb67c in make_fcontext ()
(gdb) info reg
r0             0x3618   13848
r1             0xfffffffffffcf30        1152921504606834480
r2             0x1800c8020      6443270176
r3             0x430f08 4394760 // stack
r4             0x200000 2097152 // size
r5             0x1800baf28      6443216680 // trampoline

If I try to print out the memory there:

(gdb) x/32i 0x430f08
   0x430f08:    Cannot access memory at address 0x430f08

Hmmm. Looking at the code there,

ZEND_API bool zend_fiber_init_context(zend_fiber_context *context, void *kind, zend_fiber_coroutine coroutine, size_t stack_size)
{       
        context->stack = zend_fiber_stack_allocate(stack_size);
        
        if (UNEXPECTED(!context->stack)) {
                return false;
        }

#ifdef ZEND_FIBER_UCONTEXT 
        // [...]
#else   
        // Stack grows down, calculate the top of the stack. make_fcontext then shifts pointer to lower 16-byte boundary.
        void *stack = (void *) ((uintptr_t) context->stack->pointer + context->stack->size);
        
        context->handle = make_fcontext(stack, context->stack->size, zend_fiber_trampoline);
        ZEND_ASSERT(context->handle != NULL && "make_fcontext() never returns NULL");
#endif

Something in zend_fiber_stack_allocate seems wrong, because that's where ->pointer is set.

@trowski
Member

trowski commented Aug 4, 2021

> Doesn't seem to change it

Not terribly surprised – I assume the boost authors know what they're doing.

> Something in zend_fiber_stack_allocate seems wrong, because that's where ->pointer is set.

context->stack->pointer points to the first address of an mmap'd region of memory that has not been protected. The pointer given to make_fcontext is moved to the top of the stack (the highest address in the mmap'd region). Since the stack grows down, the first valid address needs to be lower than that address (so try accessing an address 8 bytes lower than the value in r3). Note that this of course assumes the address returned from mmap is valid. Perhaps something is going wrong there on this platform?

Try switching the stack allocation to PHP's memory allocator: replace the mmap logic in zend_fiber_stack_allocate with pointer = emalloc(alloc_size); and the corresponding munmap in zend_fiber_stack_free with efree(pointer).
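
To make the stack layout concrete, here is a small sketch of the address arithmetic using plain integers rather than real mappings. The helper names are hypothetical, and the base address in the note below is an assumption for illustration; the 2 MiB size and the 0x430f08 value come from the gdb output earlier in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Top of a downward-growing stack: one past the highest valid byte. */
uint64_t demo_stack_top(uint64_t base, uint64_t size)
{
    return base + size;
}

/* True if [addr, addr + len) lies entirely inside [base, base + size). */
bool demo_in_stack(uint64_t base, uint64_t size, uint64_t addr, uint64_t len)
{
    return addr >= base && len <= size && addr - base <= size - len;
}
```

Assuming a base of 0x700000000231000 and a size of 0x200000, the top is 0x700000000431000. The 8 bytes just below the top are inside the region, the 8 bytes starting at the top are not, and 0x430f08 (the r3 value at the crash) is nowhere near it.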

@NattyNarwhal
Member Author

Built with --enable-debug so gdb behaves less awfully (AIX gdb with -O2 code is fucked, to put it lightly):

(gdb) break zend_fibers.c:338
Breakpoint 2 at 0x1000655d4: file /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_fibers.c, line 338.
(gdb) cont
Continuing.

Breakpoint 2, zend_fiber_init_context (context=0x70000000005e7c0, kind=0x1802c9c70, 
    coroutine=@0x1800fb878: 0x1000659b8 <zend_fiber_execute>, stack_size=2097152)
    at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_fibers.c:339
339             context->handle = make_fcontext(stack, context->stack->size, zend_fiber_trampoline);
(gdb) info locals
stack = 0x700000000431000
(gdb) p stack
$1 = (void *) 0x700000000431000
(gdb) p context
$2 = (zend_fiber_context *) 0x70000000005e7c0
(gdb) p *context
$3 = {handle = 0x0, kind = 0x0, function = 0x0, stack = 0x700000000063120, status = ZEND_FIBER_STATUS_INIT}
(gdb) step
warning: (Internal error: pc 0x100068328 in read in psymtab, but not in symtab.)

warning: (Internal error: pc 0x100068328 in read in psymtab, but not in symtab.)

warning: (Internal error: pc 0x100068328 in read in psymtab, but not in symtab.)

warning: (Internal error: pc 0x100068328 in read in psymtab, but not in symtab.)

warning: (Internal error: pc 0x100068328 in read in psymtab, but not in symtab.)


Program received signal SIGSEGV, Segmentation fault.
0x000000010059c75c in make_fcontext ()
(gdb) info reg   
r0             0x100065580      4295382400
r1             0xfffffffffffcd60        1152921504606834016
r2             0x1801106e8      6443566824
r3             0x430f08 4394760
r4             0x200000 2097152
r5             0x1800fb860      6443481184

The returned pointer from mmap looks fine to me, but it gets mangled somewhere once it hits make_fcontext.

@NattyNarwhal
Member Author

gdb doesn't want to put a breakpoint on the symbol (if I do, it just goes to the crash site instead), but putting a BP on the entry point's address works. I think the code from Boost mangles the r3 value 🤨

(gdb) break *0x000000010059c750
Breakpoint 1 at 0x10059c750
(gdb) run
Starting program: /QOpenSys/pkgs/bin/php -d error_log= /home/CALVIN/fiber.php
[New Thread 1]
[Switching to Thread 1]

Breakpoint 1, 0x000000010059c750 in make_fcontext ()
(gdb) info reg
r0             0x100065580      4295382400
r1             0xfffffffffffcd60        1152921504606834016
r2             0x1801106e8      6443566824
r3             0x700000000431000        504403158269890560
r4             0x200000 2097152
r5             0x1800fb860      6443481184
r6             0xdd     221
r7             0x0      0
r8             0x700000000000040        504403158265495616
r9             0x200000 2097152
r10            0x700000000231000        504403158267793408
r11            0xb00fd00004912f30       12686587373821177648
r12            0x800000000000f032       9223372036854837298
r13            0x1801452c0      6443782848
r14            0x700000000016020        504403158265585696
r15            0x7000000000720c0        504403158265962688
r16            0xffffffffffffbd8        1152921504606845912
r17            0x800200140000000        576495942044221440
r18            0xffffffffffffed0        1152921504606846672
r19            0x9fffffff000c8a0        720575940110895264
r20            0xbadc0ffee0ddf00d       13464654573299691533
r21            0xbadc0ffee0ddf00d       13464654573299691533
r22            0xbadc0ffee0ddf00d       13464654573299691533
r23            0xbadc0ffee0ddf00d       13464654573299691533
r24            0xbadc0ffee0ddf00d       13464654573299691533
r25            0xbadc0ffee0ddf00d       13464654573299691533
r26            0xbadc0ffee0ddf00d       13464654573299691533
r27            0xbadc0ffee0ddf00d       13464654573299691533
r28            0xbadc0ffee0ddf00d       13464654573299691533
r29            0xbadc0ffee0ddf00d       13464654573299691533
r30            0xbadc0ffee0ddf00d       13464654573299691533
r31            0xfffffffffffcd60        1152921504606834016
pc             0x10059c750      0x10059c750 <make_fcontext>
msr            0x800000000002f032       9223372036854968370
cr             0x22000442       570426434
lr             0x1000655f0      0x1000655f0 <zend_fiber_init_context+156>
ctr            0x0      0
xer            0x4000000        67108864
(gdb) disas
Dump of assembler code for function make_fcontext:
=> 0x000000010059c750 <+0>:     mflr    r6
   0x000000010059c754 <+4>:     rlwinm  r3,r3,0,0,27
   0x000000010059c758 <+8>:     addi    r3,r3,-248
   0x000000010059c75c <+12>:    std     r5,176(r3)
   0x000000010059c760 <+16>:    li      r0,0
   0x000000010059c764 <+20>:    std     r0,184(r3)
   0x000000010059c768 <+24>:    addi    r0,r3,232
   0x000000010059c76c <+28>:    mr      r4,r0
   0x000000010059c770 <+32>:    std     r4,152(r3)
   0x000000010059c774 <+36>:    mflr    r0
   0x000000010059c778 <+40>:    bl      0x10059c77c <make_fcontext+44>
   0x000000010059c77c <+44>:    mflr    r4
   0x000000010059c780 <+48>:    addi    r4,r4,24
   0x000000010059c784 <+52>:    mtlr    r0
   0x000000010059c788 <+56>:    std     r4,168(r3)
   0x000000010059c78c <+60>:    mtlr    r6
   0x000000010059c790 <+64>:    blr
   0x000000010059c794 <+68>:    mflr    r0
   0x000000010059c798 <+72>:    std     r0,8(r1)
   0x000000010059c79c <+76>:    stdu    r1,-32(r1)
   0x000000010059c7a0 <+80>:    li      r3,0
   0x000000010059c7a4 <+84>:    bl      0x100400ed4 <_exit>
   0x000000010059c7a8 <+88>:    ld      r2,40(r1)
End of assembler dump.

@NattyNarwhal
Member Author

(gdb) stepi
0x000000010059c754 in make_fcontext ()
$2 = (void *) 0x700000000431000
(gdb) stepi
0x000000010059c758 in make_fcontext ()
(gdb) p (void*)$r3
$3 = (void *) 0x431000
(gdb) disassemble 
Dump of assembler code for function make_fcontext:
   0x000000010059c750 <+0>:     mflr    r6
   0x000000010059c754 <+4>:     rlwinm  r3,r3,0,0,27
=> 0x000000010059c758 <+8>:     addi    r3,r3,-248

I think it's the rlwinm instruction; what is it even trying to accomplish here?

@NattyNarwhal
Member Author

Well...

    # first arg of make_fcontext() == top address of context-function
    # shift address in R3 to lower 16 byte boundary
    clrrwi  3, 3, 4

Ugh, should that be clrrdi instead? It looks like it's only meant to mask out the bottom bits, but being a word instruction it's clobbering the top bits along with them. How could this have ever worked? (I even ran the test suite for boost::context... or at least tried to, anyway. All but one test passed, and that failure was unrelated to this...)
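
The effect can be reproduced in portable C. This is a sketch that models the two mnemonics on a 64-bit register value, assuming the standard Power ISA behaviour where the word form (`clrrwi`, which gdb disassembles as `rlwinm`) leaves the upper 32 bits of the result zero; `demo_clrrwi` and `demo_clrrdi` are hypothetical names.

```c
#include <stdint.h>

/* Model of `clrrwi rA,rS,n` on a 64-bit register: clear the low n bits
 * of the low word; the upper 32 bits of the result end up zero. */
uint64_t demo_clrrwi(uint64_t rs, unsigned n)
{
    uint32_t low = (uint32_t)rs;
    uint32_t mask = (n >= 32) ? 0 : (UINT32_MAX << n);
    return (uint64_t)(low & mask);
}

/* Model of `clrrdi rA,rS,n`: clear the low n bits of the full doubleword. */
uint64_t demo_clrrdi(uint64_t rs, unsigned n)
{
    uint64_t mask = (n >= 64) ? 0 : (UINT64_MAX << n);
    return rs & mask;
}
```

Feeding in the stack top from the debugger, 0x700000000431000, and then subtracting the 248 bytes reserved by the next instruction gives 0x430f08 with the word form, which is the bad r3 seen at the crash, and 0x700000000430f08 with the doubleword form.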

@trowski
Member

trowski commented Aug 4, 2021

> Ugh, should that be clrrdi instead?

Intuitively I would expect all the word instructions to need to be double-word on 64-bit. I guess try changing both stw and clrrwi to std and clrrdi.

@NattyNarwhal
Member Author

Using doubleword instructions seems to fix the address error, but the code seems unaware of function descriptors (in the AIX/ELFv1 ABI, it's IIRC a pc/r2/"env" tuple); it's jumping straight to the descriptor itself instead of to the entry point stored in it. The code in Boost explicitly states it's for AIX, so I'm not sure why it's so blatantly wrong.

(gdb) info reg
r0             0x100065580      4295382400
r1             0xfffffffffffcd60        1152921504606834016
r2             0x1801106e8      6443566824
r3             0x700000000431000        504403158269890560
r4             0x200000 2097152
r5             0x1800fb860      6443481184
[...]
(gdb) p *(void**)$r5
$2 = (void *) 0x100065310 <zend_fiber_trampoline>
[...continue after BP in make_fcontext...]
(gdb) where
#0  0x00000001800fb860 in __jit_debug_descriptor ()
#1  0x000000010059c794 in make_fcontext ()
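
As a sketch of what "aware of function descriptors" would mean: under the layout recalled above (three doublewords: entry point, TOC, environment), a function pointer such as r5 points at the descriptor, and the code address has to be read out of it. The struct and helper names below are hypothetical.

```c
#include <stdint.h>

/* Assumed layout of an AIX / ELFv1 function descriptor. */
typedef struct {
    uint64_t entry; /* address of the function's first instruction */
    uint64_t toc;   /* value the callee expects in r2 */
    uint64_t env;   /* environment pointer, typically 0 for C code */
} demo_func_desc;

/* Read a descriptor from three consecutive doublewords, which is what an
 * indirect call must do instead of branching to the descriptor address. */
demo_func_desc demo_read_desc(const uint64_t *words)
{
    demo_func_desc d = { words[0], words[1], words[2] };
    return d;
}
```

For the descriptor at 0x1800fb860, the first word is 0x100065310 (zend_fiber_trampoline, per the gdb print above) and the second word would be expected to match the r2 value 0x1801106e8.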

@NattyNarwhal
Member Author

(Ironically, the code in the ppc64 ELFv1 case might be able to handle it because again, 99% identical ABI, just needs adapting to IBM assembler...)

@trowski
Member

trowski commented Aug 4, 2021

How did the address to __jit_debug_descriptor end up in r5? That makes no sense…

@NattyNarwhal
Member Author

It's probably not - GDB probably considers it to be because it's the last known symbol in php's .text until the function descriptor no man's land in .data.

i.e. you can see that func isn't ~82k instructions long

(gdb) x/32gx $pc - 32
0x1800fb840 <__jit_debug_descriptor+82192>:     0x0000000000000000      0x0000000100067f44
0x1800fb850 <__jit_debug_descriptor+82208>:     0x00000001801106e8      0x0000000000000000
0x1800fb860 <__jit_debug_descriptor+82224>:     0x0000000100065310      0x00000001801106e8
0x1800fb870 <__jit_debug_descriptor+82240>:     0x0000000000000000      0x00000001000659b8
0x1800fb880 <__jit_debug_descriptor+82256>:     0x00000001801106e8      0x0000000000000000
0x1800fb890 <__jit_debug_descriptor+82272>:     0x00000001000661e8      0x00000001801106e8
0x1800fb8a0 <__jit_debug_descriptor+82288>:     0x0000000000000000      0x00000001000662a4
0x1800fb8b0 <__jit_debug_descriptor+82304>:     0x00000001801106e8      0x0000000000000000
0x1800fb8c0 <__jit_debug_descriptor+82320>:     0x0000000100066484      0x00000001801106e8
0x1800fb8d0 <__jit_debug_descriptor+82336>:     0x0000000000000000      0x0000000100066524
0x1800fb8e0 <__jit_debug_descriptor+82352>:     0x00000001801106e8      0x0000000000000000
0x1800fb8f0 <__jit_debug_descriptor+82368>:     0x000000010005c134      0x00000001801106e8
0x1800fb900 <__jit_debug_descriptor+82384>:     0x0000000000000000      0x000000010005bf40
0x1800fb910 <__jit_debug_descriptor+82400>:     0x00000001801106e8      0x0000000000000000
0x1800fb920 <__jit_debug_descriptor+82416>:     0x000000010001d7e4      0x00000001801106e8
0x1800fb930 <__jit_debug_descriptor+82432>:     0x0000000000000000      0x0000000100063e2c
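
The misleading label comes from nearest-preceding-symbol lookup. The following is a simplified model of that idea, not GDB's actual implementation; the symbol address 0x1800e7730 is derived from the offsets in the dump above (0x1800fb860 minus 82224), and the second table entry is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint64_t addr;
} demo_symbol;

/* Return the symbol with the highest address that is <= addr,
 * or NULL if every symbol starts above addr. */
const demo_symbol *demo_nearest_symbol(const demo_symbol *tab, size_t n,
                                       uint64_t addr)
{
    const demo_symbol *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (tab[i].addr <= addr && (best == NULL || tab[i].addr > best->addr))
            best = &tab[i];
    }
    return best;
}
```

Looking up 0x1800fb860 in a table containing __jit_debug_descriptor at 0x1800e7730 returns that symbol with an offset of 82224, even though the address actually belongs to an unnamed function descriptor.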

@trowski
Member

trowski commented Aug 4, 2021

> GDB probably considers it to be because it's the last known symbol in php's .text until the function descriptor no man's land in .data.

That makes sense, thanks for the explanation.

I wonder then if stw is correct for storing function pointers, but clrrdi is required for the high address returned from mmap. Sort of grasping at straws here, but I'm running out of ideas. Worth a shot I guess.

@NattyNarwhal
Member Author

I think stw would truncate it to 32-bits which might be bad; I think we're missing the TOC storage (r2) in make and loading in jump, comparing with the ELF _CALL_ELF==1 case.

@trowski
Member

trowski commented Aug 4, 2021

> I think stw would truncate it to 32-bits which might be bad

I would think so too, but the 64-bit PPC assembly for mac also uses stw, so I assume that's correct.

> I think we're missing the TOC storage (r2) in make and loading in jump, comparing with the ELF _CALL_ELF==1 case.

Seems like a reasonable thing to try I guess, curious if that makes a difference.

@NattyNarwhal
Member Author

Getting closer; we no longer barf in make_fcontext, but now hit the first assert in zend_fiber_execute:

(gdb) run
Starting program: /QOpenSys/pkgs/bin/php -d error_log= /home/CALVIN/fiber.php
[New Thread 1]
Assertion failed: __EX, file  /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_fibers.c, line 437

Program received signal SIGABRT, Aborted.
[Switching to Thread 1]
0x0900000000294030 in pthread_kill () from /QOpenSys/usr/lib/libpthreads.a(shr_xpg5_64.o)
(gdb) frame 5
#5  0x00000001000659fc in zend_fiber_execute (transfer=0x700000000430f90) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_fibers.c:437
437             ZEND_ASSERT(Z_TYPE(transfer->value) == IS_NULL && "Initial transfer value to fiber context must be NULL");
(gdb) p transfer
$1 = (zend_fiber_transfer *) 0x700000000430f90
(gdb) p *transfer
$2 = {context = 0x1801106e8 <__jit_debug_descriptor+167864>, value = {value = {lval = 504403158269890496, dval = 5.7766220084047076e-275, counted = 0x700000000430fc0, str = 0x700000000430fc0, arr = 0x700000000430fc0, 
      obj = 0x700000000430fc0, res = 0x700000000430fc0, ref = 0x700000000430fc0, ast = 0x700000000430fc0, zv = 0x700000000430fc0, ptr = 0x700000000430fc0, ce = 0x700000000430fc0, func = 0x700000000430fc0, ww = {w1 = 117440512, 
        w2 = 4394944}}, u1 = {type_info = 0, v = {u = {extra = 0}, type_flags = 0 '\000', type = 0 '\000'}}, u2 = {next = 0, cache_slot = 0, opline_num = 0, lineno = 0, num_args = 0, fe_pos = 0, fe_iter_idx = 0, property_guard = 0, 
      constant_flags = 0, extra = 0}}, flags = 0 '\000'}
(gdb) p transfer->value
$3 = {value = {lval = 504403158269890496, dval = 5.7766220084047076e-275, counted = 0x700000000430fc0, str = 0x700000000430fc0, arr = 0x700000000430fc0, obj = 0x700000000430fc0, res = 0x700000000430fc0, ref = 0x700000000430fc0, 
    ast = 0x700000000430fc0, zv = 0x700000000430fc0, ptr = 0x700000000430fc0, ce = 0x700000000430fc0, func = 0x700000000430fc0, ww = {w1 = 117440512, w2 = 4394944}}, u1 = {type_info = 0, v = {u = {extra = 0}, type_flags = 0 '\000', 
      type = 0 '\000'}}, u2 = {next = 0, cache_slot = 0, opline_num = 0, lineno = 0, num_args = 0, fe_pos = 0, fe_iter_idx = 0, property_guard = 0, constant_flags = 0, extra = 0}}

@NattyNarwhal
Member Author

Seems like jump_fcontext needs fixing next:

(gdb) break *0x000000010059c7b8
Breakpoint 3 at 0x10059c7b8
(gdb) info reg
r0             0x100066c18      4295388184
r1             0xfffffffffffcd60        1152921504606834016
r2             0x1801106e8      6443566824
r3             0x70000000005e7c0        504403158265882560
r4             0x1802c9c70      6445374576
r5             0x1800fb878      6443481208
r6             0x200000 2097152
r7             0x70000000005e780        504403158265882496
r8             0x1802c9c70      6445374576
r9             0x200000 2097152
r10            0x70000000005e7c0        504403158265882560
r11            0x0      0
r12            0x70000000006f418        504403158265951256
r13            0x1801452c0      6443782848
r14            0x700000000016020        504403158265585696
r15            0x7000000000720c0        504403158265962688
r16            0xffffffffffffbd8        1152921504606845912
r17            0x800200140000000        576495942044221440
r18            0xffffffffffffed0        1152921504606846672
r19            0x9fffffff000c8a0        720575940110895264
r20            0xbadc0ffee0ddf00d       13464654573299691533
r21            0xbadc0ffee0ddf00d       13464654573299691533
r22            0xbadc0ffee0ddf00d       13464654573299691533
r23            0xbadc0ffee0ddf00d       13464654573299691533
r24            0xbadc0ffee0ddf00d       13464654573299691533
r25            0xbadc0ffee0ddf00d       13464654573299691533
r26            0xbadc0ffee0ddf00d       13464654573299691533
r27            0xbadc0ffee0ddf00d       13464654573299691533
r28            0xbadc0ffee0ddf00d       13464654573299691533
r29            0xbadc0ffee0ddf00d       13464654573299691533
r30            0xbadc0ffee0ddf00d       13464654573299691533
r31            0xfffffffffffcd60        1152921504606834016
pc             0x100065578      0x100065578 <zend_fiber_init_context+36>
msr            0x800000000002f032       9223372036854968370
cr             0x22000442       570426434
lr             0x100066c18      0x100066c18 <zim_Fiber_start+788>
ctr            0x100066904      4295387396
xer            0x34000010       872415248
(gdb) cont
Continuing.

Breakpoint 3, 0x000000010059c7b8 in jump_fcontext ()
(gdb) where
#0  0x000000010059c7b8 in jump_fcontext ()
#1  0x00000001000658fc in zend_fiber_switch_context (transfer=0xfffffffffffcd10) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_fibers.c:408
#2  0x0000000100065f9c in zend_fiber_switch_to (context=0x70000000005e7c0, value=0x0, exception=false) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_fibers.c:530
#3  0x00000001000660b0 in zend_fiber_resume (fiber=0x70000000005e780, value=0x0, exception=false) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_fibers.c:547
#4  0x0000000100066c7c in zim_Fiber_start (execute_data=0x700000000016100, return_value=0x7000000000160d0) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_fibers.c:675
#5  0x000000010009dc7c in ZEND_DO_FCALL_SPEC_RETVAL_USED_HANDLER () at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_vm_execute.h:1864
#6  0x0000000100135eec in execute_ex (ex=0x700000000016020) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_vm_execute.h:54507
#7  0x000000010013a6ac in zend_execute (op_array=0x70000000005e640, return_value=0x0) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_vm_execute.h:58834
#8  0x0000000100014818 in zend_execute_scripts (type=8, retval=0x0, file_count=3) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend.c:1789
#9  0x00000001001fbd34 in php_execute_script (primary_file=0xfffffffffffefb0) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/main/main.c:2517
#10 0x0000000100003250 in do_cli (argc=4, argv=0x1801450d0) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/sapi/cli/php_cli.c:965
#11 0x00000001000042a8 in main (argc=4, argv=0x1801450d0) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/sapi/cli/php_cli.c:1367
(gdb) info reg
r0             0x100065880      4295383168
r1             0xfffffffffffcba0        1152921504606833568
r2             0x1801106e8      6443566824
r3             0xfffffffffffcc60        1152921504606833760
r4             0x700000000430f08        504403158269890312
r5             0xfffffffffffcd10        1152921504606833936
r6             0x0      0
r7             0x0      0
r8             0x0      0
r9             0xfffffffffffcc60        1152921504606833760
r10            0x700000000430f08        504403158269890312
r11            0xb010f00003aacf30       12686904033154879280
r12            0x800000000280f032       9223372036896780338
r13            0x1801452c0      6443782848
r14            0x700000000016020        504403158265585696
r15            0x7000000000720c0        504403158265962688
r16            0xffffffffffffbd8        1152921504606845912
r17            0x800200140000000        576495942044221440
r18            0xffffffffffffed0        1152921504606846672
r19            0x9fffffff000c8a0        720575940110895264
r20            0xbadc0ffee0ddf00d       13464654573299691533
r21            0xbadc0ffee0ddf00d       13464654573299691533
r22            0xbadc0ffee0ddf00d       13464654573299691533
r23            0xbadc0ffee0ddf00d       13464654573299691533
r24            0xbadc0ffee0ddf00d       13464654573299691533
r25            0xbadc0ffee0ddf00d       13464654573299691533
r26            0xbadc0ffee0ddf00d       13464654573299691533
r27            0xbadc0ffee0ddf00d       13464654573299691533
r28            0xbadc0ffee0ddf00d       13464654573299691533
r29            0xbadc0ffee0ddf00d       13464654573299691533
r30            0xbadc0ffee0ddf00d       13464654573299691533
r31            0xfffffffffffcba0        1152921504606833568
pc             0x10059c7b8      0x10059c7b8 <jump_fcontext>
msr            0x800000000282f032       9223372036896911410
cr             0x22000444       570426436
lr             0x1000658fc      0x1000658fc <zend_fiber_switch_context+500>
ctr            0x0      0
xer            0x4000000        67108864
(gdb) cont
Continuing.

Breakpoint 1, zend_fiber_execute (transfer=0x700000000430f90) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta1/Zend/zend_fibers.c:437
437             ZEND_ASSERT(Z_TYPE(transfer->value) == IS_NULL && "Initial transfer value to fiber context must be NULL");
(gdb) info reg
r0             0x1000653b0      4295381936
r1             0x700000000430bb0        504403158269889456
r2             0x1801106e8      6443566824
r3             0x700000000430f90        504403158269890448
r4             0x700000000430f08        504403158269890312
r5             0xfffffffffffcd10        1152921504606833936
r6             0xfffffffffffcae8        1152921504606833384
r7             0x1801106e8      6443566824
r8             0x700000000430fc0        504403158269890496
r9             0x1800fb878      6443481208
r10            0x1000659b8      4295383480
r11            0x0      0
r12            0x800000000280f032       9223372036896780338
r13            0x1801106e8      6443566824
r14            0x0      0
r15            0x0      0
r16            0x0      0
r17            0x0      0
r18            0x0      0
r19            0x0      0
r20            0x0      0
r21            0x0      0
r22            0x0      0
r23            0x0      0
r24            0x0      0
r25            0x0      0
r26            0x0      0
r27            0x0      0
r28            0x0      0
r29            0x0      0
r30            0x0      0
r31            0x700000000430bb0        504403158269889456
pc             0x1000659d0      0x1000659d0 <zend_fiber_execute+24>
msr            0x800000000282f032       9223372036896911410
cr             0x8      8
lr             0x1000653b0      0x1000653b0 <zend_fiber_trampoline+160>
ctr            0x1000659b8      4295383480
xer            0x4000000        67108864

The SP seems clobbered.

@NattyNarwhal
Member Author

I filed an issue on boostorg/context#180 because the assembly not matching the ABI at all is confusing. There has to be a better explanation...

@trowski
Member

trowski commented Aug 7, 2021

Seems there's a lot more logic around copying the transfer_t pointer (i.e., boost_context_data) to the appropriate registers in the ELF version as compared to the XCOFF version, including a note about the first call to jump_fcontext.

@NattyNarwhal
Member Author

Yup, seems like making those adaptations will be needed.

I'm also going to see what code GCC emits for a smaller, isolated C test case, which I can reason about more easily.

@NattyNarwhal
Member Author

OK, I'm once again comparing the ELF and XCOFF versions, and I'm not sure where you're seeing jump_fcontext in the ELF version handle the args being bumped for the struct pointer in r3. I do notice that the "zero in r3 indicates first jump to context-function" handling is missing from the XCOFF version, but I can't think that'd be it. Otherwise, they seem the same apart from r2.

I might need to get access to a Linux box to compare this on...

@trowski
Member

trowski commented Aug 9, 2021

I was referring to the branch on r3 being 0 that determines how the copies between r5 and r6 are handled.

I put the following together quickly based on comparing the ELF and XCOFF versions. Not sure if my syntax is correct on the label and branch, but it demonstrates the differences I was referring to with !! ADDED comments.

/*
            Copyright Oliver Kowalke 2009.
   Distributed under the Boost Software License, Version 1.0.
      (See accompanying file LICENSE_1_0.txt or copy at
          http://www.boost.org/LICENSE_1_0.txt)
*/

.align 2
.globl .jump_fcontext
.jump_fcontext:
    # reserve space on stack
    subi  1, 1, 184

    std  2, 0(1)   # save TOC !! ADDED

    std  13, 0(1)  # save R13
    std  14, 8(1)  # save R14
    std  15, 16(1)  # save R15
    std  16, 24(1)  # save R16
    std  17, 32(1)  # save R17
    std  18, 40(1)  # save R18
    std  19, 48(1)  # save R19
    std  20, 56(1)  # save R20
    std  21, 64(1)  # save R21
    std  22, 72(1)  # save R22
    std  23, 80(1)  # save R23
    std  24, 88(1)  # save R24
    std  25, 96(1)  # save R25
    std  26, 104(1)  # save R26
    std  27, 112(1)  # save R27
    std  28, 120(1)  # save R28
    std  29, 128(1)  # save R29
    std  30, 136(1)  # save R30
    std  31, 144(1)  # save R31
    std  3,  152(1)  # save hidden

    # save CR
    mfcr  0
    std  0, 160(1)
    # save LR
    mflr  0
    std  0, 168(1)
    # save LR as PC
    std  0, 176(1)

    # store RSP (pointing to context-data) in R6
    mr  6, 1

    # restore RSP (pointing to context-data) from R4
    mr  1, 4

    ld  2,  0(1)  # restore TOC !! ADDED

    ld  13, 0(1)  # restore R13
    ld  14, 8(1)  # restore R14
    ld  15, 16(1)  # restore R15
    ld  16, 24(1)  # restore R16
    ld  17, 32(1)  # restore R17
    ld  18, 40(1)  # restore R18
    ld  19, 48(1)  # restore R19
    ld  20, 56(1)  # restore R20
    ld  21, 64(1)  # restore R21
    ld  22, 72(1)  # restore R22
    ld  23, 80(1)  # restore R23
    ld  24, 88(1)  # restore R24
    ld  25, 96(1)  # restore R25
    ld  26, 104(1)  # restore R26
    ld  27, 112(1)  # restore R27
    ld  28, 120(1)  # restore R28
    ld  29, 128(1)  # restore R29
    ld  30, 136(1)  # restore R30
    ld  31, 144(1)  # restore R31
    ld  3,  152(1)  # restore hidden

    # restore CR
    ld  0, 160(1)
    mtcr  0
    # restore LR
    ld  0, 168(1)
    mtlr  0

    # load PC
    ld  0, 176(1)
    # restore CTR
    mtctr  0

    # adjust stack
    addi  1, 1, 184

    cmpdi 3, 0 # !! ADDED
    beq .use_entry_arg # !! ADDED

    # return transfer_t
    std  6, 0(3)
    std  5, 8(3)

    # jump to context
    bctr

.use_entry_arg: # !! ADDED
    # copy transfer_t into transfer_fn arg registers
    mr  3, 6
    mr  4, 5

    # jump to context
    bctr

The Mac version has similar logic; perhaps it would be helpful.
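For readers less used to PowerPC assembly, the branch marked `!! ADDED` at the end of the listing above can be modelled in a few lines of Python. This is only an illustrative sketch of the control flow as written in that listing, not the Boost implementation; the function name `finish_switch` and the dictionary-based register and memory model are hypothetical.

```python
def finish_switch(regs: dict, memory: dict) -> None:
    """Model the tail of the listing: how the transfer_t (fctx, data) is handed over.

    regs maps register numbers to values; memory maps addresses to 64-bit words.
    """
    hidden, data, fctx = regs[3], regs[5], regs[6]
    if hidden == 0:
        # Zero in r3: first jump into a context function, so the
        # transfer_t fields go into the argument registers (mr 3,6 / mr 4,5).
        regs[3] = fctx
        regs[4] = data
    else:
        # Non-zero r3: write the transfer_t through the hidden pointer
        # (std 6,0(3) / std 5,8(3)).
        memory[hidden] = fctx
        memory[hidden + 8] = data


# First entry: r3 was restored as 0 from the new context's frame.
regs = {3: 0, 4: 0x2000, 5: 0xDA7A, 6: 0x1000}
mem = {}
finish_switch(regs, mem)

# Resume: r3 holds the address of the caller's transfer_t slot.
regs2 = {3: 0x3000, 4: 0x2000, 5: 0xBEEF, 6: 0x1000}
mem2 = {}
finish_switch(regs2, mem2)
```

In the first case nothing is written to memory and the values end up in r3/r4; in the second the registers are left alone and the two words land at the hidden pointer and the slot 8 bytes after it.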

@NattyNarwhal
Member Author

Ah, those would be the changes I was porting after comparing both. I thought you meant something further up, e.g. in the clumps of register loads/stores.

@trowski
Member

trowski commented Aug 9, 2021

There is also the store of r2 that is in the _CALL_ELF == 1 version, in case you missed it. Not sure if I missed any other differences. If that doesn't work, I'm at a loss without a system to test on.

@NattyNarwhal
Member Author

I didn't try changing the store/load TOC stuff yet but I think that needs to be instead of the r13, since they occupy the same space on the stack and the calling convention docs seem to indicate it's reserved. I'm building with adding the branch, and if that doesn't work, changing the register loaded there.
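To make the "same space on the stack" point concrete, here is a small Python model of the 184-byte frame, assuming the offsets shown in the listing above. It is a sketch for illustration only; `frame_layout` and `simulate_stores` are hypothetical helper names.

```python
SLOT = 8  # bytes per 64-bit slot


def frame_layout(first_slot_reg: str) -> dict:
    """Return {name: offset} for the context frame used in the listing."""
    names = [first_slot_reg] + [f"r{n}" for n in range(14, 32)]
    names += ["hidden", "cr", "lr", "pc"]
    return {name: i * SLOT for i, name in enumerate(names)}


def simulate_stores(stores) -> dict:
    """Apply (register, offset) stores in order; later stores overwrite earlier ones."""
    frame = {}
    for reg, offset in stores:
        frame[offset] = reg
    return frame


layout = frame_layout("r13")
frame_size = len(layout) * SLOT
print(frame_size, layout["hidden"], layout["pc"])  # 184 152 176

# Storing both r2 (TOC) and r13 at 0(1), as in the listing, keeps only r13.
frame = simulate_stores([("r2", 0), ("r13", 0)])
print(frame[0])  # r13
```

With only one slot at offset 0, the frame can hold either the TOC or r13, so saving the TOC means dropping the r13 store rather than adding a second one.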

@NattyNarwhal
Member Author

Yeah, after your suggested changes (I'll push those), I don't see any change.

If you're interested I can get you access to a box. Likewise, I'll try to get access to a Linux box, because this behaviour doesn't make sense if we're basically exactly matching it now and the ABIs are shared.

@trowski
Member

trowski commented Aug 9, 2021

> I think that needs to be instead of the r13,

D'oh, yeah, missed removing the store of r13, so that would squash that anyway. Assuming you fixed that before trying?

> If you're interested I can get you access to a box.

Sure, I'd be willing to mess around a bit. Not entirely sure when I'll get time for it, though, but if you can leave the time period open-ended I'll give it a try.

@NattyNarwhal
Member Author

Already did. I'll try to arrange an account for you; thankfully, there's a workaround (ucontext_t) at least, so it's not too urgent.

@NattyNarwhal
Member Author

I did notice an inconsistency between make_fcontext; I'm seeing if that changes anything...

@NattyNarwhal
Member Author

FINALLY (using the example from the RFC, changed for the current API)

$ php -d error_log= ~/fiber.php 
Waiting for data...
Received data: Hello, world!

@trowski
Member

trowski commented Aug 10, 2021

Awesome! Try running the fiber tests: make test TESTS="Zend/tests/fibers"

@NattyNarwhal
Member Author

First pass

PASS Backtrace in deeply nested function call [Zend/tests/fibers/backtrace-deep-nesting.phpt] 
PASS Backtrace in nested function call [Zend/tests/fibers/backtrace-nested.phpt] 
PASS Backtrace in with object as fiber callback [Zend/tests/fibers/backtrace-object.phpt] 
PASS Catch exception thrown into fiber, then suspend again [Zend/tests/fibers/catch-then-suspend.phpt] 
FAIL Catch exception thrown into fiber [Zend/tests/fibers/catch.phpt] 
PASS Print backtrace in fiber [Zend/tests/fibers/debug-backtrace.phpt] 
PASS Start on already running fiber [Zend/tests/fibers/double-start.phpt] 
FAIL Error reporting change reflected inside fiber [Zend/tests/fibers/error-reporting.phpt] 
PASS Exit from fiber [Zend/tests/fibers/exit-in-fiber.phpt] 
PASS Test throwing from fiber [Zend/tests/fibers/failing-fiber.phpt] 
FAIL Test throwing from fiber [Zend/tests/fibers/failing-nested-fiber.phpt] 
PASS Fast finishing fiber does not leak [Zend/tests/fibers/fast-finish-fiber.phpt] 
FAIL Fatal error in new fiber [Zend/tests/fibers/fatal-error-in-fiber.phpt] 
FAIL Fatal error within a nested fiber [Zend/tests/fibers/fatal-error-in-nested-fiber.phpt] 
FAIL Fatal error in a fiber with other active fibers [Zend/tests/fibers/fatal-error-with-multiple-fibers.phpt] 
PASS FiberError cannot be constructed in user code [Zend/tests/fibers/fiber-error-construct.phpt] 
FAIL Fiber::getCurrent() [Zend/tests/fibers/fiber-get-current.phpt] 
PASS Fiber in shutdown function [Zend/tests/fibers/fiber-in-shutdown-function.phpt] 
FAIL Fiber status methods [Zend/tests/fibers/fiber-status.phpt] 
PASS GC can cleanup cycle when callback references fiber [Zend/tests/fibers/gc-cycle-callback.phpt] 
PASS GC can cleanup cycle when fiber result references fiber [Zend/tests/fibers/gc-cycle-result.phpt] 
FAIL Fiber::getReturn() after bailout [Zend/tests/fibers/get-return-after-bailout.phpt] 
PASS Fiber::getReturn() after a fiber throws [Zend/tests/fibers/get-return-after-throwing.phpt] 
PASS Fiber::getReturn() from unstarted fiber [Zend/tests/fibers/get-return-from-unstarted-fiber.phpt] 
PASS Fiber::getReturn() in unfinished fiber [Zend/tests/fibers/get-return-in-unfinished-fiber.phpt] 
PASS Test fiber return value [Zend/tests/fibers/get-return.phpt] 
FAIL Reference to invocable class retained while running [Zend/tests/fibers/invocable-class.phpt] 
PASS Cannot resume fiber within destructor [Zend/tests/fibers/no-switch-dtor-resume.phpt] 
PASS Cannot start fiber within destructor [Zend/tests/fibers/no-switch-dtor-start.phpt] 
PASS Cannot suspend fiber within destructor [Zend/tests/fibers/no-switch-dtor-suspend.phpt] 
PASS Cannot resume fiber within destructor [Zend/tests/fibers/no-switch-dtor-throw.phpt] 
PASS Cannot start a new fiber in a finally block in a force-closed fiber [Zend/tests/fibers/no-switch-force-close-finally.phpt] 
PASS Context switches are prevented during GC collect cycles [Zend/tests/fibers/no-switch-gc.phpt] 
FAIL Out of Memory in a fiber [Zend/tests/fibers/out-of-memory-in-fiber.phpt] 
FAIL Out of Memory in a nested fiber [Zend/tests/fibers/out-of-memory-in-nested-fiber.phpt] 
FAIL Out of Memory from recursive fiber creation [Zend/tests/fibers/out-of-memory-in-recursive-fiber.phpt] 
PASS Resume non-running fiber [Zend/tests/fibers/resume-non-running-fiber.phpt] 
PASS Resume previous fiber [Zend/tests/fibers/resume-previous-fiber.phpt] 
PASS Resume running fiber [Zend/tests/fibers/resume-running-fiber.phpt] 
PASS Resume terminated fiber [Zend/tests/fibers/resume-terminated-fiber.phpt] 
FAIL Test resume [Zend/tests/fibers/resume.phpt] 
PASS Fiber function may return by ref, but getReturn() always returns by val [Zend/tests/fibers/return-by-ref.phpt] 
FAIL Silence operator does not leak out of fiber [Zend/tests/fibers/silence-operator-inside-fiber.phpt] 
FAIL Silence operator does not leak into fiber [Zend/tests/fibers/silence-operator-outside-fiber.phpt] 
FAIL Arguments to fiber callback [Zend/tests/fibers/start-arguments.phpt] 
PASS Suspend in force-closed fiber after shutdown [Zend/tests/fibers/suspend-in-force-close-fiber-after-shutdown.phpt] 
PASS Suspend in force-closed fiber, catching exception thrown from destructor [Zend/tests/fibers/suspend-in-force-close-fiber-catching-exception.phpt] 
PASS Suspend in force-closed fiber [Zend/tests/fibers/suspend-in-force-close-fiber.phpt] 
PASS Suspend within nested function call [Zend/tests/fibers/suspend-in-nested-function.phpt] 
PASS Suspend outside fiber [Zend/tests/fibers/suspend-outside-fiber.phpt] 
PASS Make sure exceptions are rethrown when throwing from fiber destructor [Zend/tests/fibers/throw-during-fiber-destruct.phpt] 
PASS Throw in multiple destroyed fibers after shutdown [Zend/tests/fibers/throw-in-multiple-destroyed-fibers-after-shutdown.phpt] 
PASS Throw into non-running fiber [Zend/tests/fibers/throw-into-non-running-fiber.phpt] 
PASS Test throwing into fiber [Zend/tests/fibers/throw.phpt] 
PASS Test unfinished fiber with finally block [Zend/tests/fibers/unfinished-fiber-with-finally.phpt] 
PASS Test unfinished fiber with nested try/catch blocks [Zend/tests/fibers/unfinished-fiber-with-nested-try-catch.phpt] 
PASS Test unfinished fiber with suspend in finally [Zend/tests/fibers/unfinished-fiber-with-suspend-in-finally.phpt] 
PASS Test unfinished fiber with suspend in finally [Zend/tests/fibers/unfinished-fiber-with-throw-in-finally.phpt] 
PASS Test unfinished fiber [Zend/tests/fibers/unfinished-fiber.phpt] 
PASS Not starting a fiber does not leak [Zend/tests/fibers/unstarted-fiber.phpt] 
=====================================================================
Number of tests :   60                60
Tests skipped   :    0 (  0.0%) --------
Tests warned    :    0 (  0.0%) (  0.0%)
Tests failed    :   17 ( 28.3%) ( 28.3%)
Tests passed    :   43 ( 71.7%) ( 71.7%)
---------------------------------------------------------------------
Time taken      :   23 seconds
=====================================================================

=====================================================================
FAILED TEST SUMMARY
---------------------------------------------------------------------
Catch exception thrown into fiber [Zend/tests/fibers/catch.phpt]
Error reporting change reflected inside fiber [Zend/tests/fibers/error-reporting.phpt]
Test throwing from fiber [Zend/tests/fibers/failing-nested-fiber.phpt]
Fatal error in new fiber [Zend/tests/fibers/fatal-error-in-fiber.phpt]
Fatal error within a nested fiber [Zend/tests/fibers/fatal-error-in-nested-fiber.phpt]
Fatal error in a fiber with other active fibers [Zend/tests/fibers/fatal-error-with-multiple-fibers.phpt]
Fiber::getCurrent() [Zend/tests/fibers/fiber-get-current.phpt]
Fiber status methods [Zend/tests/fibers/fiber-status.phpt]
Fiber::getReturn() after bailout [Zend/tests/fibers/get-return-after-bailout.phpt]
Reference to invocable class retained while running [Zend/tests/fibers/invocable-class.phpt]
Out of Memory in a fiber [Zend/tests/fibers/out-of-memory-in-fiber.phpt]
Out of Memory in a nested fiber [Zend/tests/fibers/out-of-memory-in-nested-fiber.phpt]
Out of Memory from recursive fiber creation [Zend/tests/fibers/out-of-memory-in-recursive-fiber.phpt]
Test resume [Zend/tests/fibers/resume.phpt]
Silence operator does not leak out of fiber [Zend/tests/fibers/silence-operator-inside-fiber.phpt]
Silence operator does not leak into fiber [Zend/tests/fibers/silence-operator-outside-fiber.phpt]
Arguments to fiber callback [Zend/tests/fibers/start-arguments.phpt]
=====================================================================

@NattyNarwhal
Member Author

NattyNarwhal commented Aug 10, 2021

Looking, all of the failing tests seem to be crashes; oddly, the ones I've tried all seem to be in islower from var_dump and friends. At the crash site, the value of r3 seems to be the beginning of a string, i.e. 0x2064657269766564
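The "beginning of a string" reading can be checked by decoding the register value as big-endian ASCII bytes (64-bit AIX on POWER is big-endian). A minimal Python sketch; the helper name `decode_register_as_ascii` is hypothetical.

```python
def decode_register_as_ascii(value: int, width: int = 8) -> str:
    """Interpret a register value as `width` big-endian bytes of ASCII text."""
    raw = value.to_bytes(width, byteorder="big")
    # Replace non-printable bytes with '.' so arbitrary values are safe to print.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


r3 = 0x2064657269766564  # value of r3 reported at the crash site
print(repr(decode_register_as_ascii(r3)))  # prints ' derived'
```

The bytes spell " derived", so the register holds string data where a pointer was expected.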

@NattyNarwhal
Member Author

Forgot to include a backtrace; here's the one for error-reporting.phpt. It's an islower on the first char of fmt.

(gdb) where
#0  0x09000000003bdb50 in islower () from /QOpenSys/usr/lib/libc.a(shr_64.o)
#1  0x000000010059bc64 in xbuf_format_converter (xbuf=0x700000000430490, is_char=false, fmt=0x10065cf01 <@FIX19+63257> "s", 
    ap=0x700000000430620 "\a") at /home/calvin/rpmbuild/BUILD/php-8.1.0beta2/main/spprintf.c:241
#2  0x000000010059d52c in php_printf_to_smart_str (buf=0x700000000430490, format=0x10065cf00 <@FIX19+63256> "%s", 
    ap=0x700000000430620 "\a") at /home/calvin/rpmbuild/BUILD/php-8.1.0beta2/main/spprintf.c:780
#3  0x00000001000104e4 in zend_vstrpprintf (max_len=0, format=0x10065cf00 <@FIX19+63256> "%s", ap=0x700000000430620 "\a")
    at /home/calvin/rpmbuild/BUILD/php-8.1.0beta2/Zend/zend.c:264
#4  0x0000000100013870 in zend_error_va_list (orig_type=1024, error_filename=0x70000000007a000, error_lineno=6, 
    format=0x10065cf00 <@FIX19+63256> "%s", args=0x700000000430620 "\a")
    at /home/calvin/rpmbuild/BUILD/php-8.1.0beta2/Zend/zend.c:1498
#5  0x0000000100013c28 in zend_error (type=1024, format=0x10065cf00 <@FIX19+63256> "%s")
    at /home/calvin/rpmbuild/BUILD/php-8.1.0beta2/Zend/zend.c:1566
#6  0x00000001001b27f4 in zif_trigger_error (execute_data=0x70000000007c0f0, return_value=0x700000000430718)
    at /home/calvin/rpmbuild/BUILD/php-8.1.0beta2/Zend/zend_builtin_functions.c:1129
#7  0x000000010009c440 in ZEND_DO_ICALL_SPEC_RETVAL_UNUSED_HANDLER ()
    at /home/calvin/rpmbuild/BUILD/php-8.1.0beta2/Zend/zend_vm_execute.h:1235
#8  0x00000001001362bc in execute_ex (ex=0x70000000007c070)
    at /home/calvin/rpmbuild/BUILD/php-8.1.0beta2/Zend/zend_vm_execute.h:54456
#9  0x0000000100159900 in zend_call_function (fci=0x70000000006f528, fci_cache=0x70000000006f568)
    at /home/calvin/rpmbuild/BUILD/php-8.1.0beta2/Zend/zend_execute_API.c:885
#10 0x0000000100065d14 in zend_fiber_execute (transfer=0x700000000430f90)
    at /home/calvin/rpmbuild/BUILD/php-8.1.0beta2/Zend/zend_fibers.c:475
#11 0x00000001000654d0 in zend_fiber_trampoline (data=...) at /home/calvin/rpmbuild/BUILD/php-8.1.0beta2/Zend/zend_fibers.c:287
#12 0x000000010059dcd8 in make_fcontext ()

@trowski
Member

trowski commented Aug 11, 2021

I suspect that the pointer to zend_fiber_transfer is somehow getting borked. The failing tests appear to depend on using the zval that is transferred there.

Matches the Linux ELFv1 case, and might resolve an issue with restoring from a fiber.
@NattyNarwhal
Member Author

I agree it's getting borked, but I suspect it's affecting more than just the zval; I noticed islower seems to try to load a cached LC_CTYPE-related table and gets confused. I looked at the disassembly there and it does seem to be using r13 (which is reserved per the AIX ABI docs, and is the thread pointer per the Linux one), which the XCOFF code saves and restores. I'll change that to r2, like the _CALL_ELF==1 code (since the ELF code doesn't touch r13), and see what happens.

@NattyNarwhal
Member Author

Much better!

Running selected tests.
PASS Backtrace in deeply nested function call [Zend/tests/fibers/backtrace-deep-nesting.phpt] 
PASS Backtrace in nested function call [Zend/tests/fibers/backtrace-nested.phpt] 
PASS Backtrace in with object as fiber callback [Zend/tests/fibers/backtrace-object.phpt] 
PASS Catch exception thrown into fiber, then suspend again [Zend/tests/fibers/catch-then-suspend.phpt] 
PASS Catch exception thrown into fiber [Zend/tests/fibers/catch.phpt] 
PASS Print backtrace in fiber [Zend/tests/fibers/debug-backtrace.phpt] 
PASS Start on already running fiber [Zend/tests/fibers/double-start.phpt] 
PASS Error reporting change reflected inside fiber [Zend/tests/fibers/error-reporting.phpt] 
PASS Exit from fiber [Zend/tests/fibers/exit-in-fiber.phpt] 
PASS Test throwing from fiber [Zend/tests/fibers/failing-fiber.phpt] 
PASS Test throwing from fiber [Zend/tests/fibers/failing-nested-fiber.phpt] 
PASS Fast finishing fiber does not leak [Zend/tests/fibers/fast-finish-fiber.phpt] 
PASS Fatal error in new fiber [Zend/tests/fibers/fatal-error-in-fiber.phpt] 
PASS Fatal error within a nested fiber [Zend/tests/fibers/fatal-error-in-nested-fiber.phpt] 
PASS Fatal error in a fiber with other active fibers [Zend/tests/fibers/fatal-error-with-multiple-fibers.phpt] 
PASS FiberError cannot be constructed in user code [Zend/tests/fibers/fiber-error-construct.phpt] 
PASS Fiber::getCurrent() [Zend/tests/fibers/fiber-get-current.phpt] 
PASS Fiber in shutdown function [Zend/tests/fibers/fiber-in-shutdown-function.phpt] 
PASS Fiber status methods [Zend/tests/fibers/fiber-status.phpt] 
PASS GC can cleanup cycle when callback references fiber [Zend/tests/fibers/gc-cycle-callback.phpt] 
PASS GC can cleanup cycle when fiber result references fiber [Zend/tests/fibers/gc-cycle-result.phpt] 
PASS Fiber::getReturn() after bailout [Zend/tests/fibers/get-return-after-bailout.phpt] 
PASS Fiber::getReturn() after a fiber throws [Zend/tests/fibers/get-return-after-throwing.phpt] 
PASS Fiber::getReturn() from unstarted fiber [Zend/tests/fibers/get-return-from-unstarted-fiber.phpt] 
PASS Fiber::getReturn() in unfinished fiber [Zend/tests/fibers/get-return-in-unfinished-fiber.phpt] 
PASS Test fiber return value [Zend/tests/fibers/get-return.phpt] 
PASS Reference to invocable class retained while running [Zend/tests/fibers/invocable-class.phpt] 
PASS Cannot resume fiber within destructor [Zend/tests/fibers/no-switch-dtor-resume.phpt] 
PASS Cannot start fiber within destructor [Zend/tests/fibers/no-switch-dtor-start.phpt] 
PASS Cannot suspend fiber within destructor [Zend/tests/fibers/no-switch-dtor-suspend.phpt] 
PASS Cannot resume fiber within destructor [Zend/tests/fibers/no-switch-dtor-throw.phpt] 
PASS Cannot start a new fiber in a finally block in a force-closed fiber [Zend/tests/fibers/no-switch-force-close-finally.phpt] 
PASS Context switches are prevented during GC collect cycles [Zend/tests/fibers/no-switch-gc.phpt] 
PASS Out of Memory in a fiber [Zend/tests/fibers/out-of-memory-in-fiber.phpt] 
PASS Out of Memory in a nested fiber [Zend/tests/fibers/out-of-memory-in-nested-fiber.phpt] 
PASS Out of Memory from recursive fiber creation [Zend/tests/fibers/out-of-memory-in-recursive-fiber.phpt] 
PASS Resume non-running fiber [Zend/tests/fibers/resume-non-running-fiber.phpt] 
PASS Resume previous fiber [Zend/tests/fibers/resume-previous-fiber.phpt] 
PASS Resume running fiber [Zend/tests/fibers/resume-running-fiber.phpt] 
PASS Resume terminated fiber [Zend/tests/fibers/resume-terminated-fiber.phpt] 
PASS Test resume [Zend/tests/fibers/resume.phpt] 
PASS Fiber function may return by ref, but getReturn() always returns by val [Zend/tests/fibers/return-by-ref.phpt] 
PASS Silence operator does not leak out of fiber [Zend/tests/fibers/silence-operator-inside-fiber.phpt] 
PASS Silence operator does not leak into fiber [Zend/tests/fibers/silence-operator-outside-fiber.phpt] 
PASS Arguments to fiber callback [Zend/tests/fibers/start-arguments.phpt] 
PASS Suspend in force-closed fiber after shutdown [Zend/tests/fibers/suspend-in-force-close-fiber-after-shutdown.phpt] 
PASS Suspend in force-closed fiber, catching exception thrown from destructor [Zend/tests/fibers/suspend-in-force-close-fiber-catching-exception.phpt] 
PASS Suspend in force-closed fiber [Zend/tests/fibers/suspend-in-force-close-fiber.phpt] 
PASS Suspend within nested function call [Zend/tests/fibers/suspend-in-nested-function.phpt] 
PASS Suspend outside fiber [Zend/tests/fibers/suspend-outside-fiber.phpt] 
PASS Make sure exceptions are rethrown when throwing from fiber destructor [Zend/tests/fibers/throw-during-fiber-destruct.phpt] 
PASS Throw in multiple destroyed fibers after shutdown [Zend/tests/fibers/throw-in-multiple-destroyed-fibers-after-shutdown.phpt] 
PASS Throw into non-running fiber [Zend/tests/fibers/throw-into-non-running-fiber.phpt] 
PASS Test throwing into fiber [Zend/tests/fibers/throw.phpt] 
PASS Test unfinished fiber with finally block [Zend/tests/fibers/unfinished-fiber-with-finally.phpt] 
PASS Test unfinished fiber with nested try/catch blocks [Zend/tests/fibers/unfinished-fiber-with-nested-try-catch.phpt] 
PASS Test unfinished fiber with suspend in finally [Zend/tests/fibers/unfinished-fiber-with-suspend-in-finally.phpt] 
PASS Test unfinished fiber with suspend in finally [Zend/tests/fibers/unfinished-fiber-with-throw-in-finally.phpt] 
PASS Test unfinished fiber [Zend/tests/fibers/unfinished-fiber.phpt] 
PASS Not starting a fiber does not leak [Zend/tests/fibers/unstarted-fiber.phpt] 
=====================================================================
Number of tests :   60                60
Tests skipped   :    0 (  0.0%) --------
Tests warned    :    0 (  0.0%) (  0.0%)
Tests failed    :    0 (  0.0%) (  0.0%)
Tests passed    :   60 (100.0%) (100.0%)
---------------------------------------------------------------------
Time taken      :   19 seconds
=====================================================================

I think it'll also be worth taking the changes to upstream Boost as well.

@trowski
Member

trowski commented Aug 11, 2021

Fantastic! So essentially it was a matter of modifying the XCOFF version to be in line with the ELF version when _CALL_ELF==1?

> I think it'll also be worth taking the changes to upstream Boost as well.

Absolutely. It's nice that we'll be able to give back to that project. 👍

@NattyNarwhal
Member Author

Yes, basically there were a few spots I neglected to catch.

@NattyNarwhal NattyNarwhal marked this pull request as ready for review August 11, 2021 17:56